Phase transitions in random Potts systems and the community detection problem: spin-glass type and dynamic perspectives

Phase transitions in spin glass type systems and, more recently, in related computational problems 
have gained broad interest in disparate arenas. In the current work, we focus on the "community 
detection" problem when cast in terms of a general Potts spin glass type problem. As such, our 
results apply to rather broad Potts spin glass type systems. Community detection describes the 
general problem of partitioning a complex system involving many elements into optimally decoupled 
"communities" of such elements. We report on phase transitions between solvable and unsolvable 
regimes. The solvable region may further split into "easy" and "hard" phases. Spin glass type phase 
transitions appear at both low and high temperatures (or noise). Low temperature transitions 
correspond to an "order by disorder" type effect wherein fluctuations render the system ordered or 
solvable. Separate transitions appear at higher temperatures into a disordered (or an unsolvable) 
phase. Different sorts of randomness lead to disparate behaviors. We illustrate the spin glass 
character of both transitions and report on memory effects. We further relate Potts type spin systems 
to mechanical analogs and suggest how chaotic-type behavior in general thermodynamic systems 
can indeed naturally arise in hard-computational problems and spin-glasses. The correspondence 
between the two types of transitions (spin glass and dynamic) is likely to extend across a larger 
spectrum of spin glass type systems and hard computational problems. We briefly discuss potential 
implications of these transitions in complex many body physical systems. 

I. INTRODUCTION 

One of the most significant recent applications of statistical
mechanics concerns a topic of broad interest: community detection
[1-3] in complex networks [?, ?, 13] and related computational
problems [?]-[17]. In this article, we further the challenging
study of these difficult computational problems by bringing
additional tools from physics to the fore. Our aim is not only to
study the community detection problem itself. Rather, we use the
community detection problem as a platform for a detailed
investigation of phase transitions [18]-[?] associated with complex
computational problems and, generally, Potts spin glass type
systems. Various applications of physics to computational problems
have enabled significant advances in the design of new algorithms
and in the identification and understanding of various "phases" of
computational problems, in a way that has dramatically advanced on
previous approaches. 

In this article, we provide direct evidence for earlier indications
of two phase transitions in the community detection problem and,
more generally, in Potts type spin glass systems. These Potts type
spin glass transitions occur at both low and high temperatures (or,
similarly, at low and high levels of randomness or noise). These
transitions reflect different underlying physics. Earlier reports
of such transitions were afforded by information theory measures
(as in Appendix E of [?]) and a "computational susceptibility", to
be defined in later sections of the current work, that monitors the
onset of a large number of local minima, or large computational
complexity (as in Appendix B of [13]). As was shown earlier (e.g.,
Fig. 11 in [10]), overlap parameters (to be defined herein) such as
the normalized mutual information I_N exhibit progressively sharper
changes as the system size N increases. This suggests the existence
of bona fide thermodynamic transitions. In this article, we will
investigate a "fixed" spin glass type Potts Hamiltonian. By
"fixed", we allude to spin glass systems with fixed parameters
that do not depend on the problem itself. Thus, this fixed approach
contrasts with, e.g., "modularity" [?, ?] or other models that
involve comparisons to random case systems, so-called "null models"
[?, ?, 11], that were earlier invoked in the community detection
problem. When cast in terms of a canonical fixed Potts spin
Hamiltonian, the system exhibits sharper phase transitions [13]. By
applying our model to a general random graph, we can locate phase
transitions between solvable and unsolvable regions. Solvable
regions may further splinter into "easy" and "hard" phases. We
further elaborate on disparate phase transitions (at low and high
temperatures) in these rather general Potts spin glass type
systems. It is noteworthy that a similar analysis can be done for
any other method for detecting communities. Within most of the easy
phase, all of the known methods agree on the solutions. The results
of our analysis are thus not specific to any one method. 

Insofar as the classification of computational problems is
concerned, the main tools of analysis to date have been of a static
nature and invoke various forms of "cavity" type approximations
[?, ?] and extremely powerful related approaches such as "belief
propagation" [?, 25, ?]. Cavity type methods were immensely
successful early on in studying mean-field type theories of spin
glasses and, in the last decade, have seen a rapid resurgence,
enabling new and very potent algorithms and a better understanding
of complex problems. 

In this article, we directly study the phase transitions in
computational problems such as community detection from both static
(i.e., thermodynamic) and dynamic aspects. We numerically
investigate, sans any analytical approximations, thermodynamic
quantities characterizing the transition, augmented by direct
measures of the energy landscape of these systems by use of a
"computational susceptibility", introduced later on, that monitors
the increasing number of local minima and the convergence to local
minima. In the dynamic approach, in order to relate hard
computational problems to classical dynamics, we will dualize, via
a Hubbard-Stratonovich transformation, the original (discrete)
system to be optimized to a continuous theory for which equations
of motion can be written down and the dynamics investigated. By
employing these two complementary approaches ((i) static
thermodynamic and information measures of the energy landscape and
(ii) classical dynamics), we correspondingly report on the
existence of (i) static spin-glass-type transitions as well as (ii)
dynamical transitions, i.e., the transition of nodes from stable
orbits to "chaos". The transitions as ascertained by both
approaches occur at precisely the same set of parameters describing
the problem. As far as we are aware, earlier studies have not
investigated the general phase diagram of this important problem.
Furthermore, links between dynamical mechanical transitions and
spin-glass type transitions in computational problems such as this
have, to date, not been discovered. 



II. OUTLINE 

The rest of the paper is organized as follows. In Section III, we
introduce the (general) Potts model that will form the focus of our
attention and its relation to the community detection problem. In
Section IV, we introduce the basic definitions of trials and
replicas that are imperative to our approach. These allow us to
directly explore the energy landscape of the system without the aid
of approximations. This is followed, in Section V, by a review of
information theoretic quantities as they pertain to our method. We
then proceed to present our findings. In Section VII, we present
evidence for the existence of spin-glass type transitions that may
generally occur at both high and low temperatures. We discuss the
physical origin of these transitions and the relation between the
phase diagram of the community detection problem (and, more
generally, that of the Potts model) to other important
computational problems. In Section VIII, we relate the Potts model
system to a continuous mechanical system. By examining its
dynamics, we note that, in this mechanical system, the transition
to chaos sets in at exactly the same set of parameters at which the
Potts model displays spin-glass type transitions. Various technical
details and further physical aspects have been relegated to the
appendices. 



III. THE POTTS MODEL 

We will employ a rather general spin-glass type Potts model
Hamiltonian (denoted, henceforth, as the "Absolute Potts Model"
(APM)) [?] for solving the community detection problem. The
Hamiltonian reads 

$$H(\{\sigma\}) = -\frac{1}{2} \sum_{i \neq j} \left[ A_{ij} - \gamma (1 - A_{ij}) \right] \delta(\sigma_i, \sigma_j). \qquad (1)$$

Here, A_ij is an adjacency matrix element which assumes a value of
1 if nodes i and j are connected and a value of 0 otherwise. The
spins {σ_i} (i = 1, ..., N) attain integer values 1 ≤ σ_i ≤ q.
Their values reflect the community membership. That is, if σ_i = a
then node i belongs to community number a. The parameter q denotes
the total number of communities. To simplify the analysis, we will,
unless stated otherwise, set the so-called resolution parameter
[10] to γ = 1. (In recent work [27], we reported on similar results
for general γ and a weighted version of Eq. (1). In particular, in
physics related applications for many particle systems, the weights
A_ij were determined by the two-body interactions [28, 29].) 
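
As a concrete illustration of Eq. (1), the following minimal sketch evaluates the energy of a given partition of an unweighted graph. It assumes the reading of the Hamiltonian with a prefactor of -1/2 and a sum over ordered pairs i ≠ j. The function name `apm_energy` and the two-triangle toy graph are hypothetical choices made for illustration only; they are not taken from the algorithm of the cited references.

```python
import numpy as np

def apm_energy(adjacency, sigma, gamma=1.0):
    """Energy -1/2 * sum_{i != j} [A_ij - gamma*(1 - A_ij)] * delta(sigma_i, sigma_j)."""
    A = np.asarray(adjacency, dtype=float)
    sigma = np.asarray(sigma)
    # delta(sigma_i, sigma_j) as a matrix, with the i == j terms removed
    same = (sigma[:, None] == sigma[None, :]).astype(float)
    np.fill_diagonal(same, 0.0)
    # edges are rewarded (+1), missing links are penalized (-gamma)
    weights = A - gamma * (1.0 - A)
    return -0.5 * float(np.sum(weights * same))

# Toy graph (illustrative): two triangles joined by the single edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1

planted = [1, 1, 1, 2, 2, 2]   # one community per triangle
merged = [1, 1, 1, 1, 1, 1]    # everything in a single community

print(apm_energy(A, planted))  # -6.0
print(apm_energy(A, merged))   # 1.0
```

With γ = 1, the partition into two triangles has the lower energy because merging the triangles adds eight missing links (each costing +1) while gaining only one edge.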

Although, as we elaborate on in Appendix A, we can achieve analytic
solutions for certain cases of graphs (e.g., by employing the
cavity method [?], [30]-[33] when all of the nodes are of a fixed
degree k = 3, or for Ising systems (i.e., systems with q = 2
communities)), most general graphs (with arbitrary degree and
cluster size distributions) require computer simulation. To this
end, we will undertake a direct numerical investigation of the
system at hand without the need to invoke analytical approximations
or assumptions. Our ("zero-temperature") community detection
algorithm for minimizing Eq. (1) was discussed at length in Refs.
[?, 13, ?]. In the current work, we investigate the above
Hamiltonian of Eq. (1) at zero temperature [?] and also at finite
temperatures (T > 0) with the use of a heat bath algorithm (HBA)
(Appendix B). In brief, within the HBA, we sequentially allow each
node an opportunity to change its community membership during each
time step, with probabilities determined by a Boltzmann weight
exp(-βΔE) (β = 1/T) at a specified temperature T, where ΔE is the
energy change incurred if the node were moved to each connected
cluster (or to a new cluster). Similarly, as elaborated on in more
detail in Appendix B, following each step, we further allow the
possibility of community merges based on a Boltzmann weight. 

FIG. 1: A caricature of the information theory correlations 
(springs) between "replicas" (denoted symbolically by balls) 
in a high dimensional energy landscape (in our case, graph 
partitions or Potts spin configurations). "Replicas" are 
obtained from multiple solutions of the same problem (in this 
case, the minimization of the Potts model Hamiltonian of Eq. 
(1)). The information theory correlations measure the agreement 
or overlap between the candidate solutions ("replicas"). 
In earlier works and in the current work, we use such 
correlations to ascertain system parameters (e.g., γ of Eq. (1)) 
for which clearly defined solutions appear. Throughout most of 
the current work, we will not employ inter-replica correlations 
but rather the average of the correlations between all of the 
replicas and a known (or "planted") solution to the community 
detection problem (a minimum of the Hamiltonian). For 
detailed definitions of replicas and information theory 
correlations, see Sec. IV and Sec. V, respectively. 

FIG. 3: The average variation of information V as a function of 
the "noise" p_out (the density of links connecting different 
communities). V is calculated between the proposed solution and 
the embedded constructed sample graph whose solution is known. 
(The graph has a power-law distribution of community sizes with 
a minimum n_min = 8, a maximum n_max = 40, and with the exponent 
determining the community size distribution set equal to -1.) 
We show results obtained by using our absolute Potts model 
(denoted as "APM", Eq. (1)). For comparison, we also plot the 
results determined by the "RB Potts" model [?] and modularity 
optimization ("Q-opt") [?] using simulated annealing. With the 
"APM", our algorithm demonstrates extremely high accuracy for 
the small and large systems shown above. 
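
The single-node heat bath update described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of Appendix B: it takes Eq. (1) with a prefactor of -1/2 over ordered pairs, treats the node's current community, the communities of its neighbors, and an empty "new" community as the candidate destinations, and omits the community-merge step. The helper names `move_energies`, `heat_bath_probabilities`, and `sample_move` are hypothetical.

```python
import numpy as np

def move_energies(A, sigma, node, gamma=1.0):
    """Contribution of `node` to the energy for each candidate community.

    Differences between entries equal the energy change dE between moves.
    An empty ("new") community contributes zero.
    """
    A = np.asarray(A, dtype=float)
    sigma = np.asarray(sigma)
    neighbors = np.flatnonzero(A[node])
    candidates = sorted(set(sigma[neighbors].tolist()) | {int(sigma[node])})
    energies = {}
    for c in candidates:
        members = np.flatnonzero(sigma == c)
        members = members[members != node]
        w = A[node, members] - gamma * (1.0 - A[node, members])
        energies[c] = -float(w.sum())
    energies["new"] = 0.0
    return energies

def heat_bath_probabilities(energies, temperature):
    """Normalized Boltzmann weights exp(-E/T); shifted by min(E) for stability."""
    labels = list(energies)
    e = np.array([energies[k] for k in labels])
    w = np.exp(-(e - e.min()) / temperature)
    return dict(zip(labels, (w / w.sum()).tolist()))

def sample_move(probabilities, rng):
    """Draw one destination community according to the heat bath probabilities."""
    labels = list(probabilities)
    idx = rng.choice(len(labels), p=[probabilities[k] for k in labels])
    return labels[idx]

# Toy graph (illustrative): two triangles joined by the single edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
sigma = [1, 1, 1, 2, 2, 2]

energies = move_energies(A, sigma, node=2)
probs = heat_bath_probabilities(energies, temperature=0.5)
choice = sample_move(probs, np.random.default_rng(0))
```

At low temperature the node almost always stays in its own triangle; as T grows, the three options become nearly equally likely, which is the mechanism by which the heat bath lets the system leave local minima.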



IV. DEFINITIONS: TRIALS AND REPLICAS 

FIG. 2: A schematic of the physical content of the parameters 
that we employ: (a) The convergence time τ is the number of 
steps the algorithm needs to reach a local minimum. (b) When 
the energy landscape becomes complex, more "trials" are needed 
in order to veer towards the global minimum (or minima). This 
requisite number of trials relates to the "computational 
susceptibility" χ of Eq. (7) that, as will be explained later, 
records the improvement in the quality of the solutions (as 
seen by the normalized mutual information I_N) as the number 
of trials s (different trajectories in panel (b)) is increased. 



Before turning to the specifics of our results, we need to 
introduce several basic notions. We start by discussing 
two concepts which underlie our approach. Both con- 
cepts pertain to the use of multiple identical copies of 
the same system which differ from one another by a per- 
mutation of the site indices. In the definitions of "trials" 
and "replicas" given below, we build on the existence of 
a given algorithm (any algorithm) that may minimize a 
given energy or cost function. In our particular case, we 
minimize the Hamiltonian of Eq. (1). However, these 
ideas and concepts are more general. 

• Trials. We use trials alone in our bare community detection
algorithm [?]. We run the algorithm on the same problem s
independent times. This may generally lead to different contending
states that minimize Eq. (1). Out of these s trials, we pick the
lowest energy state and use that state as the solution. In the
current work, 4 ≤ s ≤ 20. We will canonically employ s = 4 trials.
We will use s > 4 trials in the calculation of the computational
susceptibility of Eq. (7). 

• Replicas. Each sequence of the above described s trials is termed
a replica (see the schematic plot of replicas in Fig. 1). When
using "replicas" in the current context, we run the aforementioned
s trials (and pick the lowest energy solution) r independent times.
By examining information theory correlations between the r replicas
and the known (or "planted") solution, we can assess the quality
of candidate solutions. In this work, we set r = 100. 
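
The relation between trials and replicas can be summarized in a short sketch. It assumes only that some minimizer returns an (energy, state) pair on each call; the `toy_solver` below is a hypothetical stand-in for the actual greedy algorithm, and the function names are illustrative.

```python
import random

def best_of_trials(solve_once, s, rng):
    """One replica: run the solver s independent times, keep the lowest energy."""
    results = [solve_once(rng) for _ in range(s)]
    return min(results, key=lambda pair: pair[0])

def run_replicas(solve_once, r, s, rng):
    """Ensemble of r replicas, each being the best of s trials."""
    return [best_of_trials(solve_once, s, rng) for _ in range(r)]

def toy_solver(rng):
    """Hypothetical placeholder minimizer: random state, energy = sum of labels."""
    state = [rng.randint(1, 3) for _ in range(5)]
    return float(sum(state)), state

rng = random.Random(0)
replicas = run_replicas(toy_solver, r=100, s=4, rng=rng)
```

Each entry of `replicas` would then be compared with the planted solution through the information theory measures defined in the next section.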

In this work, we will briefly remark on the determina- 
tion of optimal parameters of the system. To this end, we 
will compute the average inter-replica information theory 
correlations within the ensemble of r replicas. Specifically, 
information theory extrema as a function of the scale 
parameters generally correspond to more pertinent solutions 
that are locally stable to a continuous change of 
scale. It is in this way that we will detect the important 
physical scales and parameters in the system. 

In this work, we will compute the average information measures
between the disparate candidate solutions found by different
"replicas" and the known (or "planted") solution to the problem,
which we label below as "K". In general, with A denoting graph
partitions in different "replicas" and Q(A, K) denoting the
information theory overlap between replica A and the known solution
K, the average of a general quantity Q that we will employ is,
rather explicitly, 

$$\langle Q \rangle = \frac{1}{r} \sum_{A} Q(A, K). \qquad (2)$$



In earlier works [?], we employed the average inter-replica
information theory overlaps. We will invoke this method once when
discussing the optimal value of the resolution parameter γ of Eq.
(1). Apart from that single case, we will generally not use these
average inter-replica measures here but rather their comparison to
a known solution K. 

In the context of the Potts model Hamiltonian of Eq. (1), by
"replicas", we allude to systems that initially constitute
identical copies of the system that differ only by a permutation of
the Potts spin labels. Different replicas will, generally, lead to
disparate final contending solutions. By the use of an ensemble of
such replicas, we can attain accurate results, determine
information theory correlations between candidate solutions, and
infer from these a detailed picture of the system. 

These definitions might seem fairly abstract for the moment. We
will flesh these out and reiterate their definition anew when
detailing our specific results and the information theory based
correlations that we invoke, to which we turn next. 



V. INFORMATION THEORY AND 
COMPLEXITY MEASURES 

In this section, we introduce and review the information theory
measures, as they pertain to the community detection problem, that
we will employ in our analysis (see the schematic plot in Fig. 1
depicting the information theory correlations). 

• Shannon Entropy. If there are q communities in a partition A,
then the Shannon entropy is 

$$H_A = -\sum_{a=1}^{q} \frac{n_a}{N} \log_2 \frac{n_a}{N}. \qquad (3)$$

The ratio n_a/N is the probability for a randomly selected node to
be in community a, with n_a the number of nodes in community a and
N the total number of nodes. With the aid of this probability
distribution, the Shannon entropy of Eq. (3) follows. 

• The mutual information. The mutual information I(A, B) between
candidate partitions (A and B) that are found by two replicas is 

$$I(A, B) = \sum_{a=1}^{q_A} \sum_{b=1}^{q_B} \frac{n_{ab}}{N} \log_2 \frac{n_{ab} N}{n_a n_b}. \qquad (4)$$

Here, n_ab is the number of nodes of community a of partition A
that are shared with community b of partition B, q_A (or q_B) is
the number of communities in partition A (or B), and (as earlier)
n_a (or n_b) is the number of nodes in community a (or b). 

• The variation of information. The variation of information
0 ≤ V(A, B) ≤ log_2 N between two partitions A and B is given by 

$$V(A, B) = H_A + H_B - 2 I(A, B). \qquad (5)$$

• The normalized mutual information. The normalized mutual
information 0 ≤ I_N(A, B) ≤ 1 is 

$$I_N(A, B) = \frac{2 I(A, B)}{H_A + H_B}. \qquad (6)$$

High I_N and low V values generally indicate high agreement
between the different partitions (or general Potts spin
configurations) A and B. 
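
The four measures of Eqs. (3)-(6) can be computed directly from two lists of community labels. The sketch below is a minimal illustration using only the Python standard library; the function names are our own, and the convention of returning I_N = 1 when both partitions consist of a single community (so that H_A + H_B = 0) is an assumption made here to avoid a division by zero.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Eq. (3): H = -sum_a (n_a/N) log2(n_a/N)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Eq. (4): I = sum_ab (n_ab/N) log2(n_ab N / (n_a n_b))."""
    n = len(a)
    na, nb, nab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log2(c * n / (na[x] * nb[y]))
               for (x, y), c in nab.items())

def variation_of_information(a, b):
    """Eq. (5): V = H_A + H_B - 2 I(A, B)."""
    return shannon_entropy(a) + shannon_entropy(b) - 2 * mutual_information(a, b)

def normalized_mutual_information(a, b):
    """Eq. (6): I_N = 2 I(A, B) / (H_A + H_B); defined as 1 if both entropies vanish."""
    denom = shannon_entropy(a) + shannon_entropy(b)
    return 1.0 if denom == 0 else 2 * mutual_information(a, b) / denom

planted = [1, 1, 2, 2]
relabeled = [5, 5, 7, 7]    # same partition, different labels
unrelated = [1, 2, 1, 2]    # statistically independent of `planted`
```

A relabeled copy of the planted partition gives V = 0 and I_N = 1, while a statistically independent partition gives I_N = 0, which is why these measures are insensitive to the permutation of Potts labels that distinguishes replicas.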

The physical significance of two of the following concepts is
sketched in Fig. 2. 

• The convergence time. The convergence time τ is the number of
algorithm steps needed to find a local minimum following a greedy
algorithm. As just noted above, a schematic plot explaining the
physical meaning of the convergence time τ is shown in Fig. 2. 

• The complexity. The complexity, customarily denoted as Σ(e), can
be derived from the number of states N(E) with energy E.
Specifically, N(E) ~ exp[N Σ(e)] [19], where the energy density
e = E/N (with N in the exponent and in e denoting the system size).
In this work, we will numerically determine the onset of high
complexity (which probes the number of local minima) without any
prior assumptions or approximations by directly computing the
"computational susceptibility" that we will briefly define next. 

(a) The variation of information V as a function of the 
inter-community link density p_out. Note that V for the 
q = 140 system rapidly increases from zero at precisely 
p_out = p_1 = 0.2. At a value of p_out = p_2 = 0.24, V exhibits 
a much more gradual increase (whence curves for different 
values of q cross). 

(b) The Shannon entropy H versus p_out. H starts to increase 
at precisely p_out = p_1. Beyond p_out = p_2, the entropy 
monotonically decreases and veers towards a universal curve 
appearing for all values of q. 

• The "computational susceptibility". A "computational
susceptibility" monitoring the onset of high complexity can be
defined as 

$$\chi_n = I_N(s = n) - I_N(s = 4). \qquad (7)$$

That is, χ_n is the increase in the normalized mutual information
I_N as the number of trials (number of initial starting points in
the energy landscape) is increased from s = 4 to s = n. Physically,
we ask how many different initial starting points in the energy
landscape (i.e., how many different initial "trials") are required
to achieve a certain desired threshold accuracy as measured by
information theory measures. 
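
Given values of the normalized mutual information with the planted solution for several numbers of trials, Eq. (7) reduces to a difference of averages. The sketch below assumes that I_N for each s is the mean over replicas, in line with Eq. (2); the dictionary layout, the function name, and the numbers are hypothetical placeholders, not data from this work.

```python
from statistics import mean

def computational_susceptibility(nmi_by_trials, n, baseline=4):
    """Eq. (7): chi_n = <I_N>(s = n) - <I_N>(s = baseline), averaged over replicas."""
    return mean(nmi_by_trials[n]) - mean(nmi_by_trials[baseline])

# Placeholder values of I_N per replica, keyed by the number of trials s.
nmi_by_trials = {
    4: [0.5, 0.75],
    8: [0.75, 1.0],
}

chi_8 = computational_susceptibility(nmi_by_trials, n=8)
print(chi_8)  # 0.25
```

A value of χ_n near zero indicates that additional trials do not help (either the problem is already solved or it is unsolvable), whereas a large χ_n signals a landscape with many competing local minima.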



VI. NOISE TESTS 

Similar to [?], we will use a "noise test" benchmark as a
workhorse to study phase transitions in random graphs. 

We define the system "noise" in the community detection problem as
edges that connect a given node to communities other than its
original community assignment ("inter-community" edges). In
general, we cannot initially distinguish between edges contributing
to noise and those constituting edges within communities of the
best partition(s). 

(c) The convergence time τ (following a greedy algorithm) to a 
local minimum (as shown in panel (a) of Fig. 2) exhibits a 
sharp maximum at the transition between the easy and hard 
phases at precisely p_out = p_1. The hard phase is marked not 
only by a large convergence time τ to local minima but also by 
a large complexity (a high number of metastable minima). This 
leads to a more difficult convergence to the global energy 
minimum (requiring many trials to achieve the desired accuracy 
(see text)). At p_out = p_2, the convergence time collapses 
onto the universal curve appearing for all q (for high p_out). 

(d) The "computational susceptibility" χ of Eq. (7) (as shown 
in panel (b) of Fig. 2) versus p_out for different trial 
numbers. This quantity monitors the complexity or number of 
metastable local minima. Note that χ increases from zero at 
precisely p_out = p_1. The computational susceptibility 
markedly diminishes for p_out = p_2. As is evident here, a 
higher number of trials (a higher number of starting points in 
the energy landscape) is required in order to achieve ever 
more accurate solutions. 

FIG. 4: Plots of various measures as a function of the noise 
level p_out. V is the variation of information. H is the 
Shannon entropy (Eq. (3)). τ is the number of steps needed to 
reach a local low energy state (see also Fig. 6). The 
"computational susceptibility" χ is defined in Eq. (7). In the 
examined system of N = 2048 nodes with q = 140 communities, all 
of the plots show three phases as the noise varies. (1) Below a 
noise threshold value of p_1 = 0.2, the system can be "easily" 
solved. (2) When 0.2 < p_out < 0.24, the benefit of extra 
trials is most significant (shown in (c)) and it is "hard" to 
solve the system. (3) Above noise levels of about p_2 = 0.24, 
the system cannot be perfectly solved. As we will outline, the 
two transitions at p_out = p_1, p_2 are both of the spin-glass 
type. 

Specifically, for each constructed benchmark graph, we start with N
nodes divided into q communities with a power law size distribution
(with the exponent determining the community size distribution [?]
set equal to -1, i.e., the distribution of community sizes n scales
as n^β with β = -1). We connect all "intra-community" edges at a
high average edge density p_in = 0.95. Thus, when p_out = 0, we
have decoupled clusters with no inter-community links. We then add
random "inter-community" edges ("noise") at a density of
p_out < 0.5. Specifically, p_in is defined as the ratio of the
existing intra-community edges over the maximal number of
intra-community edges, and p_out is defined as the ratio of the
existing "inter-community" edges over the maximal number of
inter-community edges. If we denote the average external degree for
each node by Z_out (i.e., the average number of links between a
given node and nodes in communities other than its own) and the
average internal degree by Z_in (i.e., the average number of links
to nodes in the same community; Z_in + Z_out = Z with Z the average
coordination number), then we define [?] 

$$p_{in} = \frac{N Z_{in}}{\sum_{a=1}^{q} n_a (n_a - 1)} \qquad (8)$$

and 

$$p_{out} = \frac{N Z_{out}}{\sum_{a=1}^{q} \sum_{b \neq a} n_a n_b}. \qquad (9)$$

In the above, as throughout, n_a denotes the number of 
nodes in community a. 
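
The densities of Eqs. (8) and (9) can be measured on any graph with a known planted partition. The following sketch is illustrative only: it counts ordered node pairs, so that the numerators equal N Z_in and N Z_out and the denominators equal the sums over n_a(n_a - 1) and n_a n_b, respectively. The function name and the toy graph are hypothetical, and the sketch assumes at least two communities and at least one community with two or more nodes (otherwise a denominator vanishes).

```python
import numpy as np

def edge_densities(adjacency, labels):
    """Return (p_in, p_out) of Eqs. (8) and (9) for a symmetric 0/1 adjacency matrix."""
    A = np.asarray(adjacency)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    off_diagonal = ~np.eye(len(labels), dtype=bool)
    intra = same & off_diagonal        # ordered pairs within a community
    inter = ~same                      # ordered pairs across communities
    p_in = A[intra].sum() / intra.sum()    # N Z_in / sum_a n_a (n_a - 1)
    p_out = A[inter].sum() / inter.sum()   # N Z_out / sum_a sum_{b != a} n_a n_b
    return float(p_in), float(p_out)

# Toy graph (illustrative): two triangles joined by the single edge 2-3.
A = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1

p_in, p_out = edge_densities(A, [1, 1, 1, 2, 2, 2])
```

For this toy graph every intra-community pair is connected (p_in = 1), and one of the nine possible inter-community pairs carries an edge (p_out = 1/9).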

When the noise is low (i.e., when p_out is small), all the
communities are well defined. As more and more external links are
progressively added to the system (p_out increases), the
communities become harder and harder to detect. At some stage, when
the external link density is sufficiently high, the communities
cannot be detected. As alluded to earlier, we investigate the phase
transition from the "solvable" to the "unsolvable" phase at both
low and high temperatures with the use of the heat bath algorithm
("HBA" in Appendix B) in the following section. 



VII. SPIN GLASS TYPE TRANSITIONS 

A. Results for information theory correlation and 
thermodynamic quantities 

With all of the preliminaries now in place, we report our findings.
The upshot of the results to be presented is evidence for the
existence of two spin glass type transitions in general random
graphs. Evidence for these transitions is afforded by changes in
the accuracy of the solution obtained by the "APM" of Eq. (1) when
noise is introduced. This is shown in Fig. 3. The variation of
information V between the test system result and the known solution
displays a phase transition as the noise p_out increases. The
transition is manifest in the sudden jump of V. The variation of
information V remains zero (indicating, essentially, perfect
solutions) up to a threshold value of the noise where a very sharp
transition is seen. We compared this transition to similar
transitions that we detected via more standard methods. These are
labeled, in Fig. 3, by "Q-opt SA" (maximization of modularity (Q),
set by a comparison to a null model [?], as solved by simulated
annealing (SA)) and "RBPM" (the Potts model of [11] wherein the
parameters in the Hamiltonian are also defined by a null model). As
seen, our "APM" of Eq. (1) (which is free [?] of the so-called
"resolution limit" [?, ?] that appears in systems with null models)
can be used to examine graphs with high levels of noise. By
comparison to other models that rely on null models, the APM
exhibits a sharper transition as the number of nodes N is increased
[13]. 



B. General features of the phase diagram as 
ascertained by numerical data 

As is evident in Fig. 4, there are three different phases. We
denote these phases by the qualifiers of (i) "easy", (ii) "hard",
and (iii) "unsolvable". These three phases are the analogs of the
three phases (i) "SAT", (ii) "hard", and (iii) "unSAT" in the k-SAT
problem. In later discussions, we elaborate on the possible
physical significance of these phases in disparate arenas such as
that of supercooled liquids. In what follows, we present our
results. We first discuss the zero temperature case (T = 0) and
then explore the physics at T > 0. 



1. T = 0 

In Fig. 4, the low noise region (p_out < p_1) is seen to be in the
"easy" phase. In this phase, the accuracy (V), entropy (H), and the
computational susceptibility (χ) are constant. Within this regime,
the algorithm is able to correctly distribute nodes into their
correct communities. We tested several systems with different
system sizes N and numbers of communities q, and in Appendix E, we
plot the first transition point p_1 in terms of N and q. 

As the noise p_out is further increased beyond a threshold value of
p_1, the system enters the "hard" phase. The existence of the
"hard" phase is reflected by the rapid growth (decrease) in the
entropy and computational susceptibility (accuracy) curves. Even
though we can increase the number of trials in order to improve the
accuracy of our solutions (as seen in panel (c) of Fig. 4), it is,
nevertheless, still hard to obtain exact solutions. 

As the noise is yet further increased and exceeds a second
threshold value (p_out > p_2), the system undergoes another phase
transition from the "hard" phase to an "unsolvable" phase. This
"unsolvable" region is reflected, amongst other things, by the
collapse of all of the curves in each panel of Fig. 4. In this
third regime, it is impossible to solve the system correctly
without infinite time. 

(a) The computational susceptibility χ(T, p_out) of Eq. (7) as 
a function of the heat bath temperature T and the level of 
"noise" p_out (density of inter-community links) for the 
system with N = 2048 nodes and q = 140 communities. 

(b) The normalized mutual information I_N(T, p_out) for the 
same system. 



2. T > 

A more detailed, higher dimensional perspective, that 
includes the effects of temperature is provided in Fig. 
[5l In this figure, summarizing our results for the 
computational susceptibility x[T,Pout), Shannon entropy 
H{T,pout), normalized mutual information lN{T,Pout) 
and system energy E{T,pout) at general finite tempera- 
tures r > 0, we plot the loci of point marking the bound- 
aries between the different phases. The "flat" phase 
that lies in the middle of these panels is the "easy" 
phase. [Within the "easy" phase, the system is easily 
solvable and the planted communities are perfectly de- 
tected.] This "easy" phase is separated by "ridges" of 
high computational susceptibility (marking the "hard" 
regions) from the "unsolvable" phases. As expected, the 
computational susceptibility/energy /entropy //a? exhibit 



(c)A plot of the energy E[T,pout)- 

The energy here is an ensemble 
average energy over 100 replicas at 



time t = 1000. 
H 




2.S O-O 

(d)Plot of Shannon entropy 
U{T,pout). 



FIG. 5: The computational susceptibility x, normalized mu- 
tual information Jjv, Shannon entropy H and energy E in 
terms of temperature T and the inter-community noise pout 
for systems with A'^ = 2048 nodes and q — 140 communi- 
ties. All of the plots show three different phases which corre- 
spond to the three panels ((a)-(c)) shown in Fig.[6l denoted as 
"hard-easy-hard". The first "ridge" in the low temperature 
in panel (a)-(d) (computational susceptibility x/normalized 
mutual information I ^ /energy _E/entropy H) corresponds to 
the "hard" phase shown in panel (a) in Fig. (6] A higher tem- 
perature hard phase is also present. A guide to the eye is 
drawn to emphasize the manifestation of the hard phases in 
all measured quantiities. The middle "flat" region in panels 
(a)-(d) is the "easy" phase. 



a precipitous jump as the noise Pout exceeds some thresh- 
old value Pi{T). A low temperature hard phase appear 
for noise levels pi (T) < Pout < P2 {T) . We can determine 
the boundaries of the "hard" phase, whenever it generally 
exists, by seeing for which values of pout and T there is 
a rapid increase of x and E. An additional high tem- 
perature bump in the computational complexity x and 
E appears for noise levels psiT) < pout < P4{T)- In this 
phase, the minimization of Eq. ([T]) is non-trivial. At yet 
higher temperatures/noise levels, it is generally impossi- 
ble to solve the system. Thus, the two loci of "ridges" in 
the computational complexity (i.e., Pi{T) < pout < P2{T) 
(a) Trapped in the local minima at zero temperature.

FIG. 6: A caricature of the accessible energy landscape at
different temperatures for a system, such as that examined in
Fig. 5, with a fixed noise level p_out which slightly exceeds
p_1(T = 0). In panel (a), at zero temperature, the system is
trapped in local minima. Panel (b) shows the system at
temperatures that are sufficiently high for the system to anneal
and better access regions in the vicinity of the lowest energy
states. This situation corresponds to the intermediate region
that lies between the two "ridges" in Fig. 5. Panel (c) shows
the system in a high temperature phase where thermal fluctuations
are exceedingly large and the system does not veer towards low
energy states.



or p_3(T) < p_out < p_4(T)) delineate the "hard" phases.
To emphasize the appearance of these ridges and their
manifestation in all measured quantities, a guide to the eye is
drawn. Within the low temperature hard phase
(p_1(T) < p_out < p_2(T)), the system becomes trapped in the
local energy minima (panel (a) of Fig. 5). At low temperatures,
we find from the exact and extensive numerical calculations (as
shown in panel (d) in Fig. 4) a very dramatic increase in
complexity just at the transition p_1, followed by a much more
gradual decrease up to p_2. The convergence time for a local
greedy algorithm (such as ours shown in (c) of Fig. 4) does not
correlate with the complexity of the system. This is so as the
system can easily converge to a wrong local metastable minimum
(while the number of such minima is given by the complexity).

In Fig. 6, we provide caricatures of the underlying physics in
these phases and the low temperature/low noise transitions. At
low temperatures, for noise p_out slightly above p_1 (at zero
temperature), the system becomes quenched in metastable local
minima. This is schematically illustrated in panel (a). As the
temperature is increased, the system may, as depicted in panel
(b) of Fig. 6, veer towards its global minimum by annealing.
Physically, a similar mechanism is at work in many frustrated
physical systems, where it goes under the name of "order by
disorder". In such cases, by
FIG. 7: The normalized mutual information I_N as a function of
p_out for a system with N = 2048 nodes at temperature T = 0. The
noise levels p_1 and p_2 are the first and the second transition
points for the particular displayed system of N = 2048 and
q = 140. The inferred values of p_1 = 0.2 and p_2 = 0.24 are
consistent with Fig. 2. The normalized mutual information I_N
records the overlap between the "important partitions" (the
optimal partition corresponding to the lowest energy state of
Eq. (1)) and the contending partitions found by the algorithm.

FIG. 8: A comparison of the normalized mutual information I_N as
a function of noise p_out between two cases: (i) one with a
fixed resolution parameter γ = 1 (see Eq. (1)) and (ii) a
computation with the optimal γ determined by the maximal
I_N/minimal V (minimal variation of information). No change in
the transition points p_1 and p_2 occurs by optimizing γ in this
zero temperature system. Indeed, for this system γ = 1 is the
optimal value of γ for noise levels p_out < p_2. The two curves
start to separate for higher noise levels.



virtue of entropic fluctuations, quenching is thwarted and the
system may probe low lying states and indeed order [36-39].
Thus, the energy and computational susceptibility may remain
constant (there is only one global energy minimum, i.e., one
state or a finite set of such states). However, it does take
progressively more time to locate the global minimum state ((c)
in Fig. 2). As the noise is further increased, the system is
still ergodic. However, it takes a very long time to find the
lowest energy state. On finite time scales, the system stays in
the vicinity of local minima, thus yielding a higher observed
energy. Only on sufficiently long time scales does the system
veer towards
FIG. 9: The energy E as a function of the temperature T for a
system with N = 512 nodes, q = 40 communities, and noise
p_out = 0.32. (This noise level exceeds the zero temperature
p_1 = 0.29 for this system.) We perform a computational
experiment at T = 2.5 and lower the temperature according to
T_{k+1} = 0.95 T_k in consecutive time steps k. After a steady
state is obtained, the process is reversed. A clear
hysteresis-like effect is evident.

its global minimum (or minima). Within this "hard" region, there
are many metastable states. This leads to a significant increase
in the complexity, as is made evident by the rapid growth of the
computational susceptibility χ of Eq. (7). The large
computational susceptibility marks the initial rapid climb of
the complexity.

We now return to the results of Fig. 5 at yet higher
temperatures and values of the noise p_out. The high temperature
"ridge" in Fig. 5 (p_3(T) < p_out < p_4(T)) corresponds to the
system being far away from the minimum energy state. As we
remarked earlier, this delineates yet another "hard" phase.
According to the above explanation and the corresponding
caricature of Fig. 6, increasing the running time and/or number
of trials should help increase the accuracy of the solution in
this region (the peak area of the computational susceptibility).
Beyond this region, at higher temperatures, the system is
unsolvable. This corresponds to panel (c) in Fig. 6.

At low temperatures and high noise, due to the proliferation of
metastable states, (i) the convergence time τ (as seen in panel
(c) of Fig. 4) can be low while (ii) the increase in accuracy by
performing more and more trials is, essentially, nil [as seen by
the low value of χ in panel (d) of Fig. 5]. Similar conclusions
can be reached at finite temperatures by examining constant T
slices of χ(T, p_out).

We now examine, in further detail, several aspects of these
transitions at T = 0. The (zero-temperature) normalized mutual
information is displayed in Fig. 7. As evident from the figure,
I_N starts to drop below its maximal value of I_N = 1 (which
indicates perfect agreement with the optimal solution) when
p_out = p_1 (i.e., at the very same value of the noise
p_out = p_1 where the relaxation time is maximal and the
complexity increases) and I_N levels off at a higher value of
the noise p_out = p_2 (coincident with the transition value as
ascertained from
FIG. 10: The autocorrelation function (Eq. (10)) as a function
of time for a system of N = 512 nodes consisting of q = 40
communities with a noise level of p_out = 0.4. The waiting time
is t_w = 100 and the temperature is T = 0.2. The four displayed
curves represent four different initializations for the studied
system. "Symmetric" initialization means that each node forms
its own community, so there are N communities
as a starting point for the algorithm. "Random" means randomly
filling q_0 communities with nodes, where q_0 is a random number
generated between 2 and N/2. "Power law distribution" means
separating the N nodes into different communities,
whose sizes satisfy a power law distribution with a negative
exponent, set to be β = -1, -2. Also, the maximal community size
is set to be 50 and the minimal community size is 8 in the above
simulation results. At low temperature (T = 0.2), all of the
curves with different initializations separate from each other
even up to times of t = 10000 steps. As this figure makes clear,
different sorts of randomness lead to different behaviors.

the energy, entropy and complexity in Fig. 2). Amongst other
collapses that we observed, systems with differing numbers of
communities q all collapse at p_out = p_2. Before we turn to a
more detailed analysis of the spin-glass character of the
transitions, we make one remark. A possible concern is that we
did not examine the transitions at the optimal value of γ.
Indeed, the central thesis of [1] was that there are optimal
values of γ that signify the natural scales in the system. In
general, transitions as a function of γ correspond to
transitions in structure that appear as the system is examined
on larger and larger scales, as we have examined in detail in
earlier works [27-29]. To ascertain the changes that occur in
the random systems that we investigated in this article for a
broad spectrum of different values of γ (i.e., containing
general γ ≠ 1), we re-investigated these systems with γ values
within the range 10~^ < γ < 100. The "best" values of γ are
ascertained by maxima of the normalized mutual information I_N
[1]. In Fig. 8, we display I_N as a function of the noise p_out
for both the fixed γ = 1 and the optimal γ determined by the
multiresolution algorithm. The first transition point p_1 is the
same in both cases, and the two curves start to separate around
the second transition point p_2. This indicates that, as it so
happens to be in this case, γ = 1 is the best value of the
resolution parameter for noise levels below p_2 in this example
system
(a) p_out = 0.22 is within the low temperature "hard" region,
where the collapse is perfect.

(b) p_out = 0.24 is around the second transition point, where
the collapse starts to wane.

(c) p_out = 0.25 is around the second transition point, where
the collapse becomes fainter.



(N = 2048, q = 140) at zero temperature.

C. Numerical validation of the spin glass character 
of the two transitions 

The proliferation of metastable states thwarts equilibration. A
specific facet of this is detailed in Appendix F wherein, by
energy measurements, the lack of equilibration at short times is
evident. As is well appreciated, this absence of equilibration
due to multiple metastable states may lead to spin-glass-like
(as well as structural-glass-like) properties. Amongst other
traits, these include memory effects previously studied for
other systems [40, 41]. When a spin glass is cooled down, a
memory of the cooling process is imprinted in the spin
structure, and this process will be reproduced if one heats the
system up.

We conduct a similar computational "experiment". We immerse our
system (with a fixed value of the noise p_out) in a heat bath.
We then lower the heat bath temperature T by small increments at
consecutive time steps k. (Each time step corresponds to a
single iteration through all nodes according to the minimization
algorithm of [1, 13].) In this case, we set T_{k+1} = 0.95 T_k.
After attaining a steady-state solution, we then reverse the
process and increase T after each step via T_{k+1} = 1.05 T_k.
In Fig. 9, we plot the long time system energy E as a function
of T during this process. The energy curve as T



(d) p_out = 0.28 is within the "unsolvable" region, where the
collapse is poor.

FIG. 11: A validation of the spin glass character of the low
temperature hard phase. We show a collapse of the
autocorrelation curves for different waiting times t_w for a
system with N = 2048 nodes and q = 140 communities, where p_out
varies from 0.22 to 0.28. The first and second transition points
for this system are p_1 = 0.2 and p_2 = 0.24. The heat bath
temperature is T = 0.1 in all these panels. The vertical axis is
g(t)C(t_w, t) where g(t) = 8 - log10(t). The horizontal axis is
u(t_w, t) = [1/(1 - μ)] [(t + t_w)^(1-μ) - t_w^(1-μ)] where
μ = 0.1. (See text.) The noise p_out = 0.22 in panel (a) lies
within the "hard" region where the collapse of the correlation
function is perfect. The noise values of p_out = 0.24 and
p_out = 0.25 in panels (b) and (c), respectively, are around the
second transition point, where the collapse becomes fainter. The
noise of p_out = 0.28 in panel (d) is above the second
transition point p_2, i.e., in the "unsolvable" region, where
the collapse becomes very poor. That the collapse of the
correlation function starts to degrade right after the second
transition point p_2 at low temperature indicates that this
transition is of the spin-glass type.



decreases follows a different path than when T increases, which
strongly implies a hysteresis-like effect. This memory effect,
as the temperature is cycled between high and low T, reinforces
the similarity between the community detection problem and a
spin glass system.
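
The cooling and heating protocol described above can be sketched in a few lines of Python. This is only an illustration of the multiplicative schedule (a factor of 0.95 per step on cooling and 1.05 per step on heating). The function name `cycle_temperatures` and the lower cutoff `t_min` are assumptions made for the example: the text only states that the process is reversed after a steady state is reached. The starting temperature of 2.5 is the value quoted in the caption of Fig. 9. At each temperature in the returned lists one would run the minimization algorithm, seeded with the state from the previous temperature, and record the energy E.

```python
def cycle_temperatures(t_start=2.5, t_min=0.05, cool=0.95, heat=1.05):
    """Return (cooling, heating) temperature lists for one hysteresis cycle.

    t_min is a hypothetical stopping point for the cooling branch; the
    source reverses the process once a steady state is attained.
    """
    cooling = [t_start]
    # Cooling branch: T_{k+1} = cool * T_k, never dropping below t_min.
    while cooling[-1] * cool >= t_min:
        cooling.append(cooling[-1] * cool)

    heating = [cooling[-1]]
    # Heating branch: T_{k+1} = heat * T_k, never exceeding t_start.
    while heating[-1] * heat <= t_start:
        heating.append(heating[-1] * heat)

    return cooling, heating
```

Because 0.95 × 1.05 is not equal to 1, the heating branch does not retrace the cooling temperatures exactly; the two energy curves are compared as functions of T rather than step by step.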

The behavior of the energy displayed in Fig. 9 suggests
the same three regions that we ascertained earlier: (i)
When the two curves overlap at low temperatures (i.e.,
T < 0.1), the system is in its "frozen phase". (ii) When
(a) The system has a noise value of p_out = 0.21 and is at a
temperature T = 1.3. With these parameters, the system is in the
hard phase (or the region of soaring computational
susceptibility in Fig. 5). Here, the collapse is nearly perfect.

(b) p_out = 0.3 at temperature T = 1.3 is around the boundary of
the hard phase. The collapse starts to lose its perfection.



the two curves separate in a medium temperature range
(i.e., 0.1 < T < 2.5), the system is in a "spin-glass" phase.
(iii) At yet higher temperature (T > 2.5), the two curves
overlap once again. This marks the onset of the "disordered"
high temperature regime.

As illustrated in Figs. 4 and 5 (and as will be further
discussed in Figs. 11 and 12), the hard phases at both low and
high temperatures do not extend over all temperatures. Rather,
as we have emphasized above, the hard phases only appear in the
"complexity" ridges as shown in panel (a) of Fig. 5. However, in
Fig. 9, the hysteresis occurs in the temperature range
0.1 < T < 2.5. This range is considerably larger than that of
the hard phases. To understand this, we remark on the
"experimental" differences between the results displayed in
Fig. 5 and those in Fig. 9. In constructing the 3D plot of the
"complexity" (panel (a)) in Fig. 5, we apply the "HBA" at each
temperature. The systems at different temperatures are
independent of one another. That is, each system is solved
afresh from the symmetric initial state. In the hysteresis loop
in Fig. 9, on, e.g., the decreasing temperature curve, a system
at higher temperature provides the initial state for a lower
temperature system. Thus, in this case, the systems at different
temperatures are not independent
(c) p_out = 0.35 at temperature T = 1.3. Here, the system is
outside the hard phase. In this case, the collapse is poor.

(d) p_out = 0.4 at temperature T = 1.3, far away from the hard
phase. The collapse is non-existent.

FIG. 12: An illustration of the spin glass character of the high
temperature hard phase. Shown is a collapse of the
autocorrelation curves for different waiting times t_w for the
system of N = 2048 nodes and q = 140 communities. The heat bath
temperature is T = 1.3 in all panels. In this collapse (see
text), the vertical axis is g(t)C(t_w, t) where
g(t) = 8 - log10(t) and the horizontal axis is
u(t_w, t) = [1/(1 - μ)] [(t + t_w)^(1-μ) - t_w^(1-μ)] where
μ = 0.1. Panel (a), with p_out = 0.21, is within the high
temperature hard phase (evident as the higher temperature "bump"
in the 3D plot of the computational susceptibility χ(p_out, T)
in Fig. 5). Within the hard phase, the collapse is perfect.
Panel (b), with p_out = 0.3, is around the boundary of the hard
phase. Correspondingly, the collapse starts to lose its
precision. Panel (c) corresponds to p_out = 0.35, outside the
hard phase. A poor collapse is seen. Panel (d) corresponds to
p_out = 0.4, far from the hard phase. No collapse is seen. That
the collapse of the autocorrelation function loses its
perfection right after the second transition point p_4 at high
temperature indicates that this transition is also of the
spin-glass type.



but rather serve as "seed" states for one another. 

Aspects of the memory effect are evidently not limited to those
of, e.g., Fig. 9. For instance, if we vary the noise (increasing
and then decreasing it) instead of the temperature for the same
system, the accuracy of the solution also forms a hysteresis
loop at low temperature (see Appendix C). Similar to a real spin
glass system, the magnitude of this effect also decreases as the
temperature increases and finally disappears beyond a threshold
temperature.

A general quantitative measure of the memory, the two-time
autocorrelation function between the system at times t_w and
t + t_w,

C(t_w, t) = \frac{1}{N} \sum_{i=1}^{N} \delta_{\sigma_i(t_w),\, \sigma_i(t_w + t)}, (10)

can be used to explore the spin-glass-like behavior. The upshot
of the discussion below is that the autocorrelation function
data adheres to a spin-glass type collapse only within the hard
phases (both at low and at high temperatures, coincident, as
emphasized earlier, with the "ridges" in Fig. 5). This affirms,
once again, the spin glass character of the transitions.
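
As a minimal sketch of how such a two-time autocorrelation could be evaluated from stored configurations, the Python below assumes that C(t_w, t) is the fraction of nodes whose community label at time t_w + t equals their label at time t_w. It also assumes that labels are directly comparable between snapshots (no relabeling of communities). The names `autocorrelation`, `autocorrelation_curve`, and `history` are hypothetical and are not taken from the authors' code.

```python
def autocorrelation(labels_tw, labels_later):
    """Fraction of nodes with the same community label in both snapshots."""
    if len(labels_tw) != len(labels_later):
        raise ValueError("snapshots must describe the same set of nodes")
    n = len(labels_tw)
    same = sum(1 for a, b in zip(labels_tw, labels_later) if a == b)
    return same / n


def autocorrelation_curve(history, t_w):
    """C(t_w, t) for t = 0, 1, ... given a list of per-step label lists.

    history[k] is assumed to hold the community label of every node
    after k algorithm steps.
    """
    reference = history[t_w]
    return [autocorrelation(reference, snapshot) for snapshot in history[t_w:]]
```

For example, `autocorrelation_curve(history, 100)` would produce one curve of the kind compared across waiting times in the scaling analysis.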

If we apply the HBA starting from different initial
configurations at low temperature (as elaborated on in Appendix
D), all of the autocorrelation curves with different
initializations separate from each other even up to times t [in
units of the iteration through all nodes according to the
algorithm of [1, 13]] as large as t = 10000 (Fig. 10). This
indicates that disparate sorts of randomness can, generally,
lead to different results. As the temperature T is increased,
all of the curves ultimately collapse onto one another. The
temperature at which the curves for different initial
configurations overlap indicates when the respective systems
start losing memory of their initial configurations, and it
directly relates to the transition temperature in the hysteresis
loop for the same system. This further establishes the existence
of a spin glass transition in the community detection problem.

We use the HBA starting from a symmetric initial state and
calculate the autocorrelation in Eq. (10) for different waiting
times t_w and temperatures T. We further found that each
autocorrelation curve C(t_w, t) corresponding to a longer
waiting time t_w lies above those with shorter waiting times,
and all the curves (with different waiting times) are non-zero
for a long period of simulation time, indicating a memory
effect. Moreover, we can predict the long time behavior of
C(t_w, t) by fitting the curves using a commonly used equation
in Figs. 11 and 12. For more details, see [44, 45].

Towards this end, we set

g(t) = a - b \log_{10}(t), (11)

and

u(t_w, t) = \frac{1}{1-\mu} \left[ (t + t_w)^{1-\mu} - t_w^{1-\mu} \right]. (12)

In the above equations, a, b and μ are parameters that need to
be optimized in order to ascertain whether a generic spin glass
type collapse occurs. In searching for a collapse of the data
points at different waiting times t_w, we use g(t)C(t_w, t) as
the vertical axis and u(t_w, t) as the horizontal axis. As seen
in Figs. 11 and 12, a collapse indeed occurs over 4 decades in
values of u(t_w, t) in both the high and low temperature hard
phases.
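
The rescaling used to test for a collapse can be sketched as follows. The default values a = 8, b = 1 and μ = 0.1 are the ones quoted in the captions of Figs. 11 and 12; the function names and the `rescale` helper are illustrative assumptions. Plotting the returned points for several waiting times on a common set of axes shows whether the curves fall on top of one another.

```python
import math


def g(t, a=8.0, b=1.0):
    """Vertical rescaling factor g(t) = a - b*log10(t); requires t > 0."""
    return a - b * math.log10(t)


def u(t_w, t, mu=0.1):
    """Horizontal scaling variable [(t + t_w)^(1-mu) - t_w^(1-mu)] / (1 - mu)."""
    one_minus_mu = 1.0 - mu
    return ((t + t_w) ** one_minus_mu - t_w ** one_minus_mu) / one_minus_mu


def rescale(t_w, times, correlations, a=8.0, b=1.0, mu=0.1):
    """Map (t, C(t_w, t)) pairs to (u(t_w, t), g(t) * C(t_w, t)) points."""
    return [(u(t_w, t, mu), g(t, a, b) * c) for t, c in zip(times, correlations)]
```

In practice a, b and μ would be tuned (for instance by a grid search) to minimize the spread between curves; the quality of the best achievable collapse is the quantity of interest.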



We discuss several features of this collapse and its coincidence
with the hard phase below. Fig. 11 corresponds to the low
temperature hard phase and Fig. 12 corresponds to the high
temperature hard phase [see, e.g., the 3D computational
susceptibility plot χ(p_out, T) in Fig. 5]. As seen in Figs. 11
and 12, in both the high and low temperature cases, the
autocorrelation functions with different waiting times t_w
exhibit a spin-glass collapse when the value of p_out lies
within the "ridge" area of the hard phases. This collapse wanes
when p_out veers towards the "foot" of the complexity ridge just
at the onset of the hard phase. The collapse ultimately becomes
non-existent when p_out is further away from the "ridge" area.
The regime where the correlation functions satisfy the
spin-glass collapse is consistent with the parameters
corresponding to the hard phase (or "ridge" in the 3D
computational susceptibility plot of Fig. 5). Putting all of the
pieces together, we see from our scaling and collapse in Figs.
11 and 12 that both the high and low temperature transitions are
of the spin-glass type.

In the random graphs, we reported on spin-glass type
transitions. Although trivial, for completeness, we should
however note that a graph can, obviously, also be very regular.
A prototypical example is that of the two-dimensional square
lattice. For such regular unfrustrated lattice systems, the
Potts model of Eq. (1) becomes the "standard" Potts model of
lattice systems. In these instances, we generally have single
first or second order transitions instead of spin-glass type
transitions. We briefly elaborate on this point. Simple regular
lattices are a particular realization of a graph (one with fixed
coordination and translational symmetry). As is well known, on,
e.g., the square lattice, the Potts model, which we use
throughout, exhibits, as a function of the temperature T, two
phases with an intervening critical point for small q (q ≤ 4);
for larger q (q > 4), a first order transition appears. Thus,
particular realizations of our Hamiltonian for these graphs
display (usual) critical points and first order phase
transitions. For more generic random graphs with high
coordination, the system displays (as we showed above and will
further elaborate on) spin-glass type transitions along with
intervening hard phases.

We further reiterate an earlier remark and note that in systems
with well defined structures on multiple scales, additional
transitions may appear as the resolution parameter γ of Eq. (1)
is varied. In earlier works, we reported on these transitions
and further employed these in the analysis of disparate systems
[1, 27-29].



D. General discussion 

In this subsection, we detail general considerations directly
related to the spin-glass Potts analysis thus far. In the next
section, we will discuss dynamics, which relates to aspects that
we detail herein. This subsection is different from others in
that here (and only in this subsection), we present a general
discussion and some speculations and do not present data.



1. Theoretical Expectations from NP-completeness 

In [46], it was shown that maximizing modularity (an earlier
alluded to prominent approach for the community detection
problem [1, 8, 2^]) is NP-complete. Thus, as all NP-complete
problems may (by their very definition) be mapped onto one
another, maximizing modularity on the most general graphs must
span the three phases (solvable and unsolvable with the further
division of the solvable problems into the "easily solvable
phase" and the "hard phase") that appear, e.g., in the k-SAT
problem, which is known to be NP-complete [47]. Similarly, if
other approaches to community detection are, ultimately, 
equally hard as maximizing modularity, then all of these 
approaches may in general display three phases. It may 
be, as in the k-SAT problem, that for simple problems, we 
have only an "easy phase" and an "unsolvable" (or "un- 
SAT") phase. This does indeed occur for some graphs. 
In general, though, we find the three different phases (as 
expected) that we reported on in this work. 



2. Physical content of the transition in many body systems 
• Approximate decoupling 

We briefly speculate, in this subsection alone, on potential
physical consequences of the phase transition that we find in
the community detection problem. As elaborated on in [28, 29], a
general many body system with two particle interactions may be
regarded as a network with edge weights determined by the
interactions. In the easy phase, in the extreme limit of
p_out = 0, the system is essentially that of disjoint
non-interacting clusters. This point is analytically connected
to any other point in the easy phase. More generally, the Potts
model Hamiltonian of Eq. (1) can be written as
H({σ}) = Σ_{k=1}^{q} H_k. Thus, the partition function becomes:



Z = \sum_{\{A\}} \prod_{k=1}^{q} \Big( \sum_{\{\sigma_i\} \in k} e^{-H_k/T} \Big) = \sum_{\{A\}} \prod_{k=1}^{q} Z_k. (13)

In Eq. (13), Z_k is the partition function as computed with the
Hamiltonian of the entire system for the particles in community
k, and {A} denotes partitions of the system.

A similar form was proposed for many body systems 
in p8| when partitioning a general interacting system 
into decoupled clusters. Even though we sum over all partitions,
we may have an important subset of partitions, denoted as {A'}
(each with a corresponding number of clusters equal to q_{A'}),
which will have, in general instances, high Boltzmann weights
and/or frequencies and will dominate the sum. These partitions
will, correspondingly, have a significantly lower free energy
relative to other partitions. In such cases, the partition
function can be approximated as
Z \simeq \sum_{\{A'\}} \prod_{c=1}^{q_{A'}} Z_c. (14)



Eq. (14) is exact in the limit of T = 0, where {A'} denotes the
ground state(s) of our Potts type Hamiltonian. If p_out is small
(in particular if p_out = 0), then there will generally exist a
small number of sharply defined ground states {A'} pertaining to
partitions into completely disconnected communities. This
general trend of dominant subsets may persist within the easily
solvable phase.

The possible upshot of this discussion is that we 
might, in easy phases, approximate many body inter- 
acting systems (such as supercooled liquids that we will 
briefly discuss next) as effectively composed of disjoint 
non-interacting clusters. This picture may badly break 
down once transition lines between the easy phase and 
the hard or unsolvable phases are traversed. 

• Possible relation to structural glasses and other 
complex physical systems 

Glasses (according to the theories such as the ran- 
dom first order transition theory of glass (RFOT) in 
[4^) may have three phases as a function of tempera- 
ture. In the intermediate phase, the system displays a 
large complexity (as manifest in the configurational entropy
being extensive). If we replace the interacting particles in a
supercooled liquid (that form a glass at low temperatures) by
decoupled communities [28, 29], then the three phases found in
the computational community detection problem may be manifest as
three disparate phases of supercooled liquids as a function of
temperature. Within RFOT, at temperatures in an intermediate
region (T_0 < T < T_A), the system physically displays an
extensive configurational entropy (which is tantamount to an
extremely large complexity in the current context). This
configurational entropy precipitously onsets at T = T_A and
gradually diminishes until it is no longer extensive at a lower
temperature (T = T_0), whence the system freezes into an "ideal
glass" that is permanently stuck in a metastable state.

We will discuss, in Section VIII, dynamical aspects that
directly relate to the Potts model Hamiltonian. Insofar as
additional general aspects of our community detection results
and the implications of phase boundaries are concerned, we make
a brief comment. When, as discussed in [28, 29], a weighted
version of Eq. (1) is used with edge weights that are set by
forces, then in overdamped viscous systems (where the total
force on a particle is proportional to its velocity,
f_i = c v_i), particles that experience a similar total force
will tend to move in unison. Thus, in the easy phase, motion of
decoupled cohesively moving particles will occur. In the
unsolvable phase, the particle motion will be more com- 
plicated. In earlier work, forces were used to study com- 
munity detection with, overall, similar results to the one 
afforded by our spin glass approach in this work [l^ . 
When other weights are used (such as potentials, two- 
body correlations or other metrics), similar decoupling 
within the solvable phase signifies a tendency of the clus- 
ters not to be related insofar as the metric being used. 



3. Image segmentation 

Recently, we investigated and invoked the features of 
the phase diagram in order to address the computer vi- 
sion problem of detecting objects in general images [l^l 
(including notably challenging ones). As in this work for random
graphs, by varying parameters such as the temperature, the
(graph) resolution parameter γ, and physical length scales, we
explored the community detection phase diagrams for image
segmentation. Within the easy phase, disparate objects were
clearly seen. As the system moved into the hard phase, the
objects became less sharp and more fragmented. These ultimately
became very noisy in the unsolvable phase.

In summary, whenever a decomposition of an inter- 
acting many body system into nearly decoupled commu- 
nities is possible (indeed, as alluded to above, such a 
decomposition is exact for Potts model systems wherein 
the exchange energy between spins in different domains 
is zero) then the phase transitions that we report on here 
for the community detection problem may carry direct 
physical consequences. This may afford a direct link be- 
tween the phase diagrams of hard computational prob- 
lems (as ascertained by physically inspired approaches) 
and the phase diagrams of physical systems that may be 
investigated via solutions to these related computational 
problems. It is important to note that the direct relation 
between complexity and glassiness is not simple as some 
problems that may be investigated by sub-optimal algo- 
rithms (such as physical stochastic systems) may appear 
to have a "hard phase" while if investigated by a more 
efficient algorithm do not have a "hard phase" [4^ . Nev- 
ertheless, it may well be possible that the decomposition 
of physical systems into simple elements will no longer be 
simple at the onset into complex states such as those of 
supercooled liquids. Indeed, in recent work, we applied the
community detection ideas to general many body systems
(including glasses) in order to flesh out prospective important
structures on all scales [27-29].



VIII. DYNAMICAL ASPECTS 

In the following, we also study the related dynamical 
transition. Dynamic approaches to community detection 
have been suggested earlier [l^, [131 • To describe the dy- 
namical process, we need to calculate the trajectory (of 
community memberships) for each node as a function of time.
Specifically, we use the correspondence between the q-state
Potts model and a clock-type model in (q - 1) dimensions. We
replace the Kronecker delta δ(σ_i, σ_j) in Eq. (1) by a product
n_i · n_j, where n_i and n_j are the vertices of a regular
(q - 1)-dimensional simplex. On such simplices (e.g., an
equilateral triangle (q = 3), tetrahedron (q = 4), ...),
n_i · n_j = [1 + 1/(q - 1)] δ(σ_i, σ_j) - 1/(q - 1). Thus, as is
well known, we can cast the Hamiltonian of Eq. (1) into the form
H = -Σ_{ij} A'_{ij} n_i · n_j, where A'_{ij} is the interaction
weight. If we insert an external field h_i into this simplified
Hamiltonian, then it becomes

H = -\sum_{ij} A'_{ij}\, \mathbf{n}_i \cdot \mathbf{n}_j - \sum_{i} \mathbf{h}_i \cdot \mathbf{n}_i. (15)
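
The simplex representation of the Potts states can be checked numerically with a short sketch. The construction below is one standard choice and is an assumption of this example, not necessarily the authors' parametrization: it takes the q unit basis vectors in q dimensions, subtracts their centroid, and normalizes. The resulting vectors span a (q - 1)-dimensional subspace and have overlap 1 for equal states and -1/(q - 1) for different states, which is the relation quoted above. The function names are hypothetical.

```python
import math


def simplex_vertices(q):
    """Return q unit vectors (as lists of length q) forming a regular simplex."""
    norm = math.sqrt(1.0 - 1.0 / q)
    return [
        [((1.0 if a == c else 0.0) - 1.0 / q) / norm for c in range(q)]
        for a in range(q)
    ]


def dot(v, w):
    """Euclidean inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(v, w))


def expected_overlap(q, same_state):
    """[1 + 1/(q-1)] * delta - 1/(q-1), with delta = 1 for equal states."""
    delta = 1.0 if same_state else 0.0
    return (1.0 + 1.0 / (q - 1)) * delta - 1.0 / (q - 1)
```

For q = 3 the three vectors form an equilateral triangle with mutual overlaps of -1/2, and for q = 4 a tetrahedron with mutual overlaps of -1/3.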

In what follows, we will first outline a very simple 
new general method for relating a general statistical me- 
chanics system (such as the particular Potts model un- 
der consideration) and a dynamical system from classical 
mechanics. Although this method will be specifically applied
to the Potts model, all of its steps can be replicated
for other systems as well. We will then proceed to show 
the results of our numerical analysis. The final result of 
our analysis is that the spin glass transitions relate to 
transitions to chaos in the dynamics of the continuous 
mechanical system. 

A. Relating discrete Hamiltonians to continuous 
dynamics 

In this subsection, we illustrate how it is possible to relate the discrete Potts model Hamiltonian of Eq. (1) [and its clock model variant of Eq. (15)] to a mechanical system with continuous dynamics. Many similar variants of the method outlined below are possible. Although our present aim is to investigate the Potts model Hamiltonians, as noted above, our method can be applied mutatis mutandis to general discrete Hamiltonians. A benefit of this mapping is that it bridges chaos in the more standard mechanical sense to that reported in spin glass systems.

Starting with Eq. (15), we perform a Hubbard-Stratonovich transformation via non-compact auxiliary fields $\vec{\eta}_i$ to arrive at the effective Hamiltonian (or, more precisely, free energy)

$$H_{\rm eff} = -\ln Z, \qquad (16)$$

where Z is the partition function.

The dynamical equation for a node moving under the effective field is, for a damped system, given by

$$\frac{d\vec{\eta}_i}{dt} = -\left.\frac{\delta H_{\rm eff}}{\delta \vec{\eta}_i}\right|_{h_i=0}. \qquad (17)$$

We initialize the auxiliary field $\vec{\eta}_i$ to be some constant vector close to 0. We can solve this dynamical relation to obtain the non-compact auxiliary field $\vec{\eta}$ as a function of time.

We can obtain the expression for the node trajectories $\langle \vec{n}_i \rangle$ in terms of time by taking the derivative of the partition function Z [Eq. (15)] with respect to the source field $\vec{h}_i$,

$$\langle \vec{n}_i \rangle = \left.\frac{\delta \ln Z\{h_i\}}{\delta \vec{h}_i}\right|_{h_i=0}. \qquad (18)$$

Substituting $\vec{\eta}$ from Eq. (17) into Eq. (18), we can determine the trajectory of the nodes.



B. Numerical results for the continuous dynamical 
analog 



FIG. 13: Plots of node trajectories $\langle \vec{n}_i \rangle$ as a function of time t (number of algorithm steps). The tested system has N = 24 nodes, q = 4 communities, and is solved at a temperature of T = 0.05. According to the description in the text, $\vec{n}_i$ is a q - 1 = 3 dimensional vector. In each plot, the three different Cartesian components of $\langle \vec{n}_i \rangle$ are marked by different colors (shades). Node i is picked randomly from the 24 nodes. In panel (a), the noise $p_{out} = 0.2$ is below the transition point $p_1 = 0.28$. In panel (b), $p_{out} = 0.3$ is above $p_1$. Note that panel (a) shows a convergent solution for node i whereas panel (b) indicates the absence of a collapse.



Eq. (17) describes overdamped (or Aristotelian) dynamics. It is, of course, possible to also define the system in such a way that it evolves according to Newton's equation. In overdamped systems, the energy of the system goes down with time and thus the system veers towards a local (or global) energy minimum. The system exhibits no dynamics once it gets stuck in a local (global) minimum of the energy. For the system shown in Fig. 13, in the absence of perturbing fields, at low noise within the solvable region, the node coordinates $\langle \vec{n}_i \rangle$ quickly collapse to the origin. Conversely, at high values of the noise (i.e., large $p_{out}$) the node coordinates do not converge (and indeed, as we elaborate on below, the system is not solvable). As detailed in Appendix G, we further applied weak perturbing fields $\{\vec{h}_i\}$ and found that they can indeed veer the system, at low noise, towards the correct solutions.
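The statement that overdamped dynamics lowers the energy until a minimum is reached can be illustrated with a minimal sketch. The energy below is a toy double well chosen purely for illustration; it is an assumption of this sketch and not the effective Hamiltonian of the text, and the function names and step size are likewise hypothetical.

```python
def energy(x):
    """Toy double-well energy (hypothetical stand-in for H_eff)."""
    return sum(0.25 * xi ** 4 - 0.5 * xi ** 2 for xi in x)


def gradient(x):
    """Gradient of the toy energy with respect to each coordinate."""
    return [xi ** 3 - xi for xi in x]


def relax(x0, dt=0.01, steps=2000):
    """Forward-Euler integration of dx/dt = -dE/dx.

    Returns the final state and the energy recorded after every step.
    """
    x = list(x0)
    trace = [energy(x)]
    for _ in range(steps):
        g = gradient(x)
        x = [xi - dt * gi for xi, gi in zip(x, g)]
        trace.append(energy(x))
    return x, trace
```

Started from a vector close to 0, e.g. `relax([0.01, -0.02, 0.03])`, each coordinate slides into the nearest well at +1 or -1 while the recorded energy never increases, after which the system no longer moves.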

The system shown in Fig. 13 contains only N = 24 nodes with q = 4 communities at a temperature T = 0.05. Prior to investigating this system using the dynamical approach outlined above, we first examined this system using the entropy/energy/computational susceptibility measures discussed in this article and found that, in this system, there is no hard phase. Rather, there is a direct transition (or, more precisely, a crossover in this small system) from an easy solvable phase for $p_{out} < p_1 = 0.28$ to a disordered unsolvable system for $p_{out} > p_1$. The result of our dynamic analysis following Eq. (27) demonstrates the existence of a phase transition (or crossover for this finite N system) at precisely the same value of $p_{out}$ found by the analysis of the thermodynamic quantities associated with the Potts model for this small system. In this case, the dynamics of the nodes illustrate that when $p_{out}$ exceeds $p_1$, the system exhibits a transition from a stable system ($p_{out} < p_1$) to one which is chaotic ($p_{out} > p_1$). Our dynamic approach to the community detection transition may generally bridge such transitions in system dynamics to thermodynamic phase transitions.

We illustrated, via our dynamic approach, how ergodic behavior can arise depending on $p_{out}$ (and, similarly, also on temperature). This relates the "chaotic" behavior in spin glasses [s^ - ls^, which reflects the sensitivity to the temperature and, in our case, to other parameters (such as $p_{out}$) that define the computational problem, to real chaotic behavior of a dynamical system. Further, in our spin-glass approach, Fig. 15 illustrates that auto-correlation functions corresponding to different initial conditions (or randomness) remain different up to long times. This sensitive dependence on the initial conditions is the hallmark of chaotic systems. Although we have not observed such
an intermediate hard phase for the small N system that we investigated using this dynamic approach, we speculate that the above dynamic transition from more stable orbits to "chaos" may, for larger systems, also exhibit an intermediate region (corresponding to $p_1(T) < p_{out} < p_2(T)$ for low T, or also $p_3(T) < p_{out} < p_4(T)$ for higher T) where more and more branching points (or period doubling, etc.) may appear as the system transitions into chaos. Ideas from KAM analysis may, hopefully, be invoked in more sophisticated treatments.



IX. CONCLUSIONS 

We reported on disparate high and low temperature 
spin glass type phase transitions in the community de- 
tection problem and, by extension, rather general dis- 
ordered Potts spin systems. Our investigation involved 
several complementary approaches and was not confined 
to systems with a small number of Potts spin flavors or 
communities. In the community detection setting, simi- 
lar to other computational problems, phase transitions 
occur between a solvable and unsolvable region. The 
solvable region may further split into an "easy" and a 
"hard" region. We illustrated how thermal "order out of 
disorder" may come into play in these systems and pro- 
vided ample evidence of the spin-glass character of the 
transitions that occur. Amongst other results, we found that different sorts of randomness can lead to different behaviors, e.g., "chaos". We introduced a general correspondence between discrete spin systems and mechanical systems with continuous dynamics. With the aid of this mapping, we illustrated that spin glass type transitions in the disordered system correspond to transitions to chaos in the mechanical system. The mapping that we use to relate the thermodynamics to the dynamics suggests how chaotic-type behavior in thermodynamic systems can indeed naturally arise in hard computational problems and spin glasses. We further briefly speculated on possible physical consequences of the transitions that we find here (e.g., in supercooled liquids and glasses). Recently, we indeed employed the transitions that we found here in the analysis of such complex physical systems [28], [29] as well as image segmentation [g^j.

Acknowledgments. This work was supported by NSF grant DMR-1106293 (ZN). We also wish to thank S. Chakrabarty, R. Darst, P. Johnson, B. Leonard, A. Middleton, M. E. J. Newman, D. Reichman, V. Tran, and L. Zdeborová for discussions and ongoing work.

Note added in Proof: Some time after the initial 
appearance of the current work (ssj and earlier reports 
of particular aspects of a phase transition (Appendix E of 
and Appendix B of [l3|), the authors of Ref. [s^ in- 
vestigated phase transitions in the community detection 
problem on sparse graphs of small q and reached similar 
conclusions as we have for general graphs with larger q 
values. 



Appendix A: Theory analysis of the community 
detection problem 

In this appendix, we follow the description of the cavity method in [32] and merely generalize it to all graphs (general q and unequal size communities). The uninitiated reader is encouraged to peruse [32] in order to familiarize him/herself with the basic cavity method (and the notations) that we expand on below. The brief introduction below is not self-contained.

Within the cavity approach, each node passes a message along edges. A message $\mathbf{u}_{i \to j}$ from node i to j is a q-dimensional vector of zeros and ones. Node i takes the messages from all the other nodes $k \neq j$ connected to i and sums them. The cavity field, defined as $\mathbf{h}_{i \to j} = \sum_{k \neq j} J_{ki}\, \mathbf{u}_{k \to i}$, is obtained through the above process. Finally, node i converts this cavity field into a message to j by setting the maximal components in $\mathbf{h}_{i \to j}$ to one and the rest to zero. The probability distribution of messages being sent in the system is denoted as $Q^s(\mathbf{u})$. The superscript s denotes a possible dependence of this distribution on the index of the predefined cluster to which the sending node belongs.

To be consistent with the notations in [32], in what follows in this appendix (and only in this appendix), we will employ the same definition for $p_{in}$ and $p_{out}$ as that of [32]. For a fixed cluster A, $p^{A}_{in} = p(A|A)$ is the conditional probability that a link starting with a node in A also ends in A. Given two (different) clusters A and B, $p^{B|A}_{out} = p(B|A)$ denotes the conditional probability that a link starting with a node in A would end in B. It follows directly from these definitions that

$$p^{A}_{in} + \sum_{B \neq A} p^{B|A}_{out} = 1. \qquad (19)$$

In particular, when q = 2, there are only two clusters/states, A and B, and we have $p^{A}_{in} + p^{B|A}_{out} = 1$.

Following the same calculation process as in [32], we also
test the phase transition of community detection in a
random Bethe lattice with exact degree k = 3. However, there
is an essential difference: our Hamiltonian does not
have the constraint of equal-size clusters, which means
we do not have the symmetric condition for the order 
parameter Q''(u) = rj^^^ where, c = 1 denotes the "cor- 
rect" component, and wg{1 — c, — 1} denotes the 
number label of a"wrong" component. In our case, u 
now can not be necessarily written as a; = ||u|| — c; we 
now write this as ?7(u)''*"*'^. 

We first discuss the case of q = 2 and then proceed to its generalization.

In systems with two clusters (q = 2), there are two (Potts) spin states. We will denote these herein as A and B (once again, we do so to be consistent with the notations in [33], in particular Eqs. (6.60)-(6.63) therein). In this case, there are 6 different "order parameters". We will denote these as $\eta^A_{11}$, $\eta^A_{10}$, $\eta^A_{01}$, $\eta^B_{11}$, $\eta^B_{10}$ and $\eta^B_{01}$.

In the following, we present the expressions for $\eta^A_{11}$, $\eta^A_{10}$ and $\eta^A_{01}$. The expressions for $\eta^B_{11}$, $\eta^B_{10}$ and $\eta^B_{01}$ have an identical form with a permutation of the superscripts A and B.

$$\eta^A_{11} = \left(p^A_{in}\eta^A_{11} + p^{B|A}_{out}\eta^B_{11}\right)^2 + 2\left(p^A_{in}\eta^A_{10} + p^{B|A}_{out}\eta^B_{10}\right)\left(p^A_{in}\eta^A_{01} + p^{B|A}_{out}\eta^B_{01}\right), \qquad (20)$$

$$\eta^A_{10} = \left(p^A_{in}\eta^A_{10} + p^{B|A}_{out}\eta^B_{10}\right)^2 + 2\left(p^A_{in}\eta^A_{10} + p^{B|A}_{out}\eta^B_{10}\right)\left(p^A_{in}\eta^A_{11} + p^{B|A}_{out}\eta^B_{11}\right), \qquad (21)$$

$$\eta^A_{01} = \left(p^A_{in}\eta^A_{01} + p^{B|A}_{out}\eta^B_{01}\right)^2 + 2\left(p^A_{in}\eta^A_{01} + p^{B|A}_{out}\eta^B_{01}\right)\left(p^A_{in}\eta^A_{11} + p^{B|A}_{out}\eta^B_{11}\right). \qquad (22)$$

These constitute a quadratic system of 6 equations with 6 variables. This system of equations is numerically solvable. The solutions are continuous with respect to the coefficients [...] form a generalization of the system studied in [...].

The above procedure can also be easily generalized to systems with q > 2 components, leading to more terms on the right-hand side of Eqs. (20)-(22). In general, we can define an abstract function g:

$$g: \{0, 1, 2\}^q \setminus \{(0, 0, \ldots, 0)\} \to \{0, 1\}^q \setminus \{(0, 0, \ldots, 0)\},$$

$$(a_1, a_2, \ldots, a_q) \mapsto \begin{cases} (a_1, a_2, \ldots, a_q), & 2 \notin \{a_1, a_2, \ldots, a_q\}, \\ (\lfloor a_1/2 \rfloor, \lfloor a_2/2 \rfloor, \ldots, \lfloor a_q/2 \rfloor), & 2 \in \{a_1, a_2, \ldots, a_q\}, \end{cases}$$

then for any $\mathbf{a} = (a_1, a_2, \ldots, a_q) \in \{0, 1\}^q \setminus \{(0, 0, \ldots, 0)\}$ and $1 \le s \le q$, we have the equation

$$\eta^s_{\mathbf{a}} = \sum_{\substack{g(\mathbf{u}+\mathbf{v}) = \mathbf{a} \\ \mathbf{u}, \mathbf{v} \in \{0,1\}^q \setminus \{(0,0,\ldots,0)\}}} \Big(p^s_{in}\eta^s_{\mathbf{u}} + \sum_{t \neq s} p^{t|s}_{out}\eta^t_{\mathbf{u}}\Big) \times \Big(p^s_{in}\eta^s_{\mathbf{v}} + \sum_{t \neq s} p^{t|s}_{out}\eta^t_{\mathbf{v}}\Big). \qquad (23)$$

In the above equation [Eq. (23)], $\mathbf{a}$ denotes the q-dimensional incoming message composed of 0s and 1s. We introduce $(\mathbf{u}, \mathbf{v})$ to be any pair of vectors that "sum up to" a given vector $\mathbf{a}$, in the sense of $g(\mathbf{u} + \mathbf{v}) = \mathbf{a}$. We are able to numerically evaluate the order parameters $\eta^s_{\mathbf{u}}$ as a function of $p_{in}$.

From this, we can obtain the phase boundaries of the solvable region. Furthermore, to test whether our simulation result matches the theory, we perform the same accuracy test using our greedy algorithm on ER graphs with $\langle k \rangle = 16$ and four equal-sized clusters. Our result of Fig. 12 in Appendix A of [13] is consistent with the cavity inspired result of Fig. 6.7 in [32]. In both plots of the percentage of correctly identified nodes in terms of $p_{in}$/$Z_{out}$, the critical values of $p_{in}$ and $Z_{out}$ for the accuracy drops are the same if we convert $Z_{out}$ into $p_{in}$ via the relation $p_{in} = (16 - Z_{out})/16$ (for the graphs considered therein with an average total coordination number per node of $\langle k \rangle = 16$). In, e.g., Fig. 6.7 of Ref. [32], the threshold value of $p_{in}$ is given by $p_{in} \approx 45\%$. In Fig. 12 of [13], the critical $Z_{out}$ obtained by our greedy algorithm is $Z_{out} \approx 9$, which corresponds to $p_{in} \approx 7/16 \approx 43\% \approx 45\%$.



Appendix B: Heat Bath Algorithm

We extend the zero-temperature (greedy) algorithm of [1^ to finite temperature via a heat bath algorithm. This algorithm allows each node to become a member of one community with probability set by a thermal distribution [ll|. The probability is

$$P_{a \to b} = \frac{\exp(-\beta \Delta E_{a \to b})}{\sum_{d} \exp(-\beta \Delta E_{a \to d})}. \qquad (24)$$



Here $\Delta E_{a \to b}$ is the change of energy for moving this node from cluster a to cluster b, and d runs through all connected clusters (neighbors) of this node (including the case that d = a, i.e., this node remains in cluster a; and the case that d is a newly added cluster, i.e., this node becomes a new sole-node cluster).

The steps of our heat bath algorithm are as follows: 

(1) Initialize the system. Symmetrically initialize the system by assigning each node to its own community (i.e., $q_0 = N$). If the number of communities is constrained to some value q, we instead randomly initialize the system into q communities.

(2) Find the best cluster for node i. Sequentially "pick up" each node and scan its neighbor list (including its current cluster and the newly added cluster). Calculate the energy change as if it were moved to each connected cluster. Then calculate the probability for node i in cluster a to be moved to a connected cluster b using Eq. (24). We then use the probabilities for the different b's to determine which cluster the node is moved to; i.e., generate a random number between 0 and 1, determine which probability range the random number falls in, and move the node from cluster a to the selected cluster b.

(3) Repeat step 2 for all nodes in the system. A node is frozen for the current iteration once it has been considered for a move.

(4) Merge clusters. Allow for the merger of two communities based on the merge probability. Towards this end, we calculate the energy change as if the current community were merged with each of its neighbors. We then use Eq. (24) to calculate merge probabilities.

(5) Repeat the above two steps. Repeat steps 2 to 4 until the maximum number of iterations is reached.

(6) Repeat all the above steps for s trials. Repeat steps 1-5 for s trials and select the lowest energy result as the best solution. Each trial randomly permutes the order of nodes in the symmetric initial state.
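The node-move part of step (2) can be sketched as follows. This is a minimal illustration, assuming that the thermal distribution has the standard Boltzmann (softmax) form over the candidate clusters; the function names, the dictionary layout of the energy changes, and the `rng` parameter are hypothetical choices made for this sketch and are not taken from the original implementation.

```python
import math
import random


def heat_bath_probabilities(delta_E, T):
    """Probabilities proportional to exp(-dE / T) over candidate clusters.

    delta_E maps each candidate cluster label (including the current
    cluster, with dE = 0, and an optional new empty cluster) to the
    energy change of moving the node there.
    """
    beta = 1.0 / T
    # Subtracting the minimum avoids overflow; it cancels in the ratio.
    e_min = min(delta_E.values())
    weights = {b: math.exp(-beta * (dE - e_min)) for b, dE in delta_E.items()}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


def choose_cluster(delta_E, T, rng=random):
    """Draw r in [0, 1) and return the cluster whose probability range contains r."""
    probs = heat_bath_probabilities(delta_E, T)
    r = rng.random()
    cumulative = 0.0
    for b, p in probs.items():
        cumulative += p
        if r < cumulative:
            return b
    return b  # guard against round-off when r is very close to 1
```

At low temperature the draw almost always selects the move with the largest energy decrease, which is consistent with the heat bath results approaching those of the zero-temperature greedy algorithm.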

The new algorithm is similar to the earlier greedy algorithm [1], [13] except for steps (2) and (4). The nodes are moved based on a random process. Thus, the outcome may sometimes be sensitive to the initial random seed. As noted within the main text, when the system is within the easy phase, all seeds lead to the same final outcome. However, when the system is within the hard phase, changing the random seed may significantly alter the final result. In such a case, different initial conditions lead the system to get stuck in different local minima (each corresponding to a different partition of the system into disparate communities). This is why we repeat steps 1-5 for s trials (usually s is set to be 4). The additional trials sample different solutions evenly with the symmetric initialization, which reduces the dependence on initial conditions. In the unsolvable phase, for any finite number of trials s, the quality of the solutions does not visibly change.

We should note that the new "heat bath algorithm" 
that we introduced above is different from the commonly 
used "simulated annealing algorithm". (The latter is a 
generalization of the "Metropolis Monte Carlo" proce- 
dure (MMC) 113)). 

Within the conventional MMC procedure, the probability for a node i in cluster a to be moved to a connected cluster b is given by $\min(1, \exp(-\beta(E_b - E_a)))$. This implies that a node i in community a will (with certainty) be moved to cluster b if the energy change is negative. Such an algorithm may preclude a move of still lower energy (if such a move were to be found later on). By contrast, within our "heat bath algorithm" (HBA), nodes are not immediately moved to the first tried cluster if the energy change is negative. We compute the probabilities for all connected clusters. Obviously, the cluster with the largest energy decrease has the largest probability to be the "candidate of absorption" for node i. Thus, in contrasting the commonly used MMC procedure and our HBA, it seems to be easier to get to the lowest energy state of the studied system within our algorithm. Our procedure allows nodes to explore more energy states in each step and to better equilibrate.
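The Metropolis behavior described above, in which the first tried move with a negative energy change is accepted with certainty, can be made concrete with a short sketch. The function names and the idea of passing candidate energy changes in their trial order are assumptions made for illustration only.

```python
import math
import random


def metropolis_accept(dE, T, rng=random):
    """Accept a single proposed move with probability min(1, exp(-dE / T))."""
    return dE <= 0.0 or rng.random() < math.exp(-dE / T)


def first_accepted_move(dEs, T, rng=random):
    """Try candidate moves in the given order.

    Returns the index of the first accepted move, or None if every
    candidate is rejected.
    """
    for k, dE in enumerate(dEs):
        if metropolis_accept(dE, T, rng):
            return k
    return None
```

With candidate energy changes `[-0.5, -2.0]`, the first move is always taken and the deeper decrease of -2.0 is never examined, whereas a heat bath step weighs both candidates before choosing.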

The results obtained at low temperature by our HBA are very close to the results obtained by the zero temperature "greedy algorithms" [1], [13].



Appendix C: Memory effect in $I_N$ versus noise plot

FIG. 14: The plot of $I_N$ in terms of $p_{out}$ for a system with N = 512, q = 40. ($I_N$ is a normalized variant of mutual information; for a detailed explanation, see Sec. V.) From top to bottom, the temperature varies from T = 0.1 to T = 2. Note that the curves in panels (a) and (b) show the effect of hysteresis at low temperatures. Hysteresis disappears when the temperature is sufficiently high, e.g., T = 2 in panel (c).



In the main text of the article, we provided an example of hysteresis by decreasing and then increasing the temperature of the system (see Fig. [S]). However, examples are not limited to this particular cycle [4^. Other ways to see the memory effect include varying the noise level. This appendix is devoted to the study of the hysteresis curves in such a case. That is, in this appendix we consider the effect of adding external edges between disparate communities (i.e., increasing $p_{out}$) and then removing these edges (i.e., decreasing $p_{out}$). We examine the accuracy of solutions as a function of noise and see whether the two curves coincide. A non-coincidence between the two processes exhibits the same memory effect that we earlier reported on by varying the temperature.

Fig. 14 shows the results of the above experiments at three temperatures: T = 0.1, 1 and 2. In the T = 0.1 and T = 1 systems, the curves with increasing $p_{out}$ and decreasing $p_{out}$ form hysteresis loops. The hysteresis loop at temperature T = 1 in panel (b) is less pronounced than its counterpart at T = 0.1 in panel (a). Upon further increase of the temperature, the hysteresis disappears (as shown in panel (c)).

The plots in Fig. 14 already exhibit decreasing memory effects as the temperature increases. Thus, there must exist a temperature beyond which the effect disappears. We investigated the plots at temperatures T = 1.1, 1.2, ..., 1.9 (not shown here). The hysteresis loop disappeared at about T = 1.7, in line with our other reported results including the disappearance (at T = 1.6) of memory of initial conditions, to which we turn next.



Appendix D: Memory effect in correlation functions with different initial conditions

In this appendix, we report on the autocorrelation functions (Eq. [...]) for three different types of initial configurations. The conclusion of this appendix is that the system may be sensitive to initial conditions. The three initializations are denoted as symmetric, random and power law distribution.

• "Symmetric" initialization alludes to an initialization wherein each node forms its own community, so there are N communities in the beginning (as in step (1) of the algorithm outlined in Appendix B).

• "Random" refers to randomly filling $q_0$ communities with nodes, where $q_0$ is a random number generated between 2 and [...].

• In the "Power law distribution", N nodes are partitioned into different communities whose sizes adhere to a power law distribution ($\mathrm{Prob}(n) \propto n^{\beta}$) with a negative exponent $\beta$. In the cases displayed, we set $\beta = -1, -2$. The maximal community size in the true solution is set to be 50 and the minimal community size is 8.
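The three initializations listed above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the number of communities `q0` in the random case is passed in by the caller, and the size bounds `n_min` and `n_max` of the power law case are free parameters of the sketch (the text quotes 8 and 50 for the true solution, not necessarily for the initial state).

```python
import random


def symmetric_init(N):
    """Each node forms its own community, so there are N communities."""
    return list(range(N))


def random_init(N, q0, rng=random):
    """Assign every node to one of q0 communities uniformly at random.

    Some of the q0 communities may end up empty.
    """
    return [rng.randrange(q0) for _ in range(N)]


def power_law_init(N, beta, n_min, n_max, rng=random):
    """Partition N nodes into communities with sizes drawn with Prob(n) ~ n**beta."""
    sizes = list(range(n_min, n_max + 1))
    weights = [n ** beta for n in sizes]
    labels, community = [], 0
    while len(labels) < N:
        n = rng.choices(sizes, weights=weights)[0]
        n = min(n, N - len(labels))  # truncate the last community to fit N
        labels.extend([community] * n)
        community += 1
    rng.shuffle(labels)  # remove the ordering of nodes by community
    return labels
```

Each function returns a list whose entry i is the community label of node i, so the three starting states can be fed to the same relaxation routine and their autocorrelation curves compared.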

Fig. 15 vividly illustrates that all the curves with different initializations separate, at low temperatures, from each other even up to times of t = 10000. The curve with symmetric initialization lies at the bottom in panel (a). However, as temperature increases, all of the curves veer towards one another. The symmetric curve moves from the bottom to the top at a temperature T = 1.6 as shown in panel (b). As temperature increases further (T = 2), all of the curves overlap in panel (c).

At a temperature of T = 1.6, systems with different initial configurations start to overlap. Beyond this temperature, there is no remaining memory of the initial conditions. Furthermore, the relative positions of the curves change, which is another indication of the loss of memory. The spin temperature at which we found the hysteresis loop to disappear in Fig. 14, T = 1.7, nearly coincides with the temperature found here.

In Fig. 15, the relative positions of the "random" and "power law distribution" curves do not persist: their positions change irregularly as temperature varies. This indicates that these two are similar to each other; there is no essential difference between them. However, the curve of symmetric initialization lies below the other two until the temperature rises up to T = 1.6, which happens for all the waiting times that we tested ($t_w = 100$, $t_w = 10$ and $t_w = 1000$ (not shown here)). This suggests that the symmetric initialization differs, in an essential way, from the other two initializations.

FIG. 15: The autocorrelation function as a function of time for a system with N = 512, q = 40, $p_{out} = 0.4$ (above the transition point in this system). The waiting time is $t_w = 100$ in all the panels. The four curves in each panel represent four different initializations for the studied system. Temperature varies from T = 0.2 to T = 2. At low temperature, all the curves with different initializations separate from each other even up to t = 10000 (panel (a)). Then, as T increases, all of the curves start moving towards each other (panel (b)), and finally overlap with each other (panel (c)).



Appendix E: Finite size effects

In this appendix, we examine the zero temperature transition at $p_{out} = p_1$ for systems with different system sizes N and community numbers q.

FIG. 16: The first transition point $p_1$ as a function of q (panel (a)) and as a function of N (panel (b)) at zero temperature.



In Fig. 16, we display, at zero temperature, the first phase transition point $p_1$ (we remind the reader that this noise level marks the first transition point encountered as $p_{out}$ is increased) as a function of the community number q (panel (a)) and the system size N (panel (b)). From the numerical results that we obtained, we find that $p_1$ scales linearly with 1/q. As seen in Fig. 16, panel (a), the value of the first phase transition point $p_1$ in each curve approaches zero as q increases. This is consistent with what is expected: for a fixed system size, increasing the number of spin flavors q introduces a multitude of possible states and the system becomes progressively disordered.

This may also be made analytical via a (1/q) type expansion wherein the partition function of the Potts model is expanded in terms of correlations (of having two ($\sigma_i = \sigma_j$), and then three, etc., connected spins be of the same flavor). The resulting terms in such an expansion illustrate that increasing q emulates (not too surprisingly) increasing the temperature T. For a system at large N (i.e., a system in the thermodynamic limit), increasing q renders the system progressively less ordered. Thus, in situations such as that of an increasing number of communities q that scales linearly with the system size N (such that the average community size remains constant), the transitions become less well defined as N → ∞. In the fitting form below of Eq. (25), the saturation of the system phase diagram for large N and the relatively quick drop in the sensitivity of our results to finite size effects become apparent.

FIG. 17: The plot of energy versus time for the system N = 1024, q = 70. We fix the noise as $p_{out} = 0.32$, which is above the first transition point $p_1 = 0.3$ (in the "hard" region). Note that the energy curves with different temperatures have a "crossover" at about t = 1250. Before that, the curve with low temperature is always above the one with high temperature. After that, except at temperatures of T = 0.1, 0.2 or 0.3, the curves of the low temperature systems dip below those of the higher temperature ones. The "crossover" property shown here is a sign of transition from non-equilibrium to equilibrium.

On the other hand, from panel (b) in Fig. 16, for a
fixed q, when N is small, $p_1$ first increases with a very
steep slope and thence increases very slowly with a nearly
plateau behavior. We can interpret these data by the
function pi = a(q) + 67V~^, where a{q) is a constant 
for each q (e.g., a = 0.45 for q = 4, and a{q = 80) = 
0.4). Combining both panels, we can present pi in the 
examined range as a two-variable function N and q, 



pi oc i(a((?) 



4) 



(25) 



Thus, as alluded to above, finite size effects drop and features of the system phase diagram (as evidenced by $p_1$ above) saturate for large N. Thus, in considering limits such as N → ∞ while holding the average community size n = N/q fixed, we essentially increase q for a system in the thermodynamic limit.

FIG. 18: The node trajectories in the presence of the weak perturbing field for a system of size N = 24 with q = 4 communities with a noise level of $p_{out} = 0.1$ at a temperature of T = 0.01. As discussed earlier, in this system $\langle \vec{n}_i \rangle$ (for any node i) is a three component vector. Each Cartesian component is labeled by a different color (shade) in the above figure. The field $\vec{h}_i$ is chosen to be along the direction of the preset cluster membership for node i, i.e., $\vec{n}_i$. The averages $\langle \vec{n}_i \rangle$ in panels (a) to (d) indicate the node location under applied fields $\vec{h}_i$ (below each plot). These fields bias the node trajectories towards the solution of the system.



Appendix F: Equilibration times 

In equilibrium, the energy is (of course) constant. The 
system energy is set by its temperature. In this appendix 
we investigate, at different temperatures, the evolution of 
the system from an initial high energy state. In the par-
ticular results that we provide below, the energy of the
disparate systems at low temperatures would, at short 
times, naively seem to violate thermodynamic expecta- 
tions. Systems with lower temperature can have higher 
energies than those at higher temperature. The origin 
of this and similar effects is that significant time may be 
required to achieve thermodynamic equilibrium. Within 
the low temperature unsolvable phase the system is out 
of equilibrium. In the hard phase, equilibrium is achieved 
yet it requires long times. 

We now present our results. We set the system size (number of nodes) to be N = 1024 with q = 70 communities and a value of the noise given by $p_{out} = 0.32$. As this value of $p_{out}$ is larger than the threshold value of $p_1 = 0.3$ for this system, the system is in the "hard" phase. We examine the system evolution with the algorithm time steps in Fig. 17. In this plot, the system has a "crossover" at about t = 1250. Prior to that time, the energy always decreases as T increases. This reflects the fact that times below t = 1250 are not long enough for the system to equilibrate. After that, except for the cases of T = 0.1, 0.2 or 0.3, the energy increases as T increases. Thus, t = 1250 constitutes sufficient time for equilibration except for a few systems at very low temperature (that require yet longer times). This "crossover" property of the system is a sign of the restoration of equilibrium at sufficiently long times.

All the curves show a decrease of the energy with time until a plateau is reached. When time is not sufficiently long, the system is not ergodic and is out of equilibrium. As seen in Fig. 17, times t > 2000 are required for the lowest temperature systems (e.g., T = 0.1, T = 0.2) to equilibrate.



Appendix G: Nodes' trajectory after applying the 
perturbation field 

As mentioned earlier in the text, effective fields may direct the continuous dynamical system of Section VIII towards correct non-trivial solutions. In this brief appendix, we outline how this is achieved and provide some results.

The dynamical equation for a node moving under the effective field is

$$\frac{d\vec{\eta}_i}{dt} = -\frac{\delta H_{\rm eff}}{\delta \vec{\eta}_i}. \qquad (26)$$



Similarly to Section VIII, yet now with general ap-
plied fields, we have



(51nZ{/i,} 



hi 



n^{ii^+|ihi) 



hi 



^(i)i+0h,) 



(27) 



As shown in Fig. 18, if we choose the perturbation field to favor a preset community membership for each node, i.e., let $\vec{h}_i = \alpha \vec{n}_i$, where $\alpha$ is a small constant, then within the solvable phase the nodes will be biased towards the corresponding particular partition of the system.
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